| creator |
Grieb, Melanie
| date |
2005-08-03
| | | description |
59 pages
| |
It is expensive to experimentally determine the sequence, structure
and function dependencies of a protein. Therefore, information is
transferred from known to unknown proteins using the recognition of
similar protein characteristics. Because the cost of protein
synthesis is falling (currently $1.50 per nucleotide, expected to reach
$0.10 per nucleotide in the near future), the long-term goal is to use
this knowledge to predict the function of a specific protein sequence
completely, enabling non-experimental de novo design of proteins. Family
classification plays a key role in finding the solution to that
problem. Similarity in fold without sequence similarity is less likely
to reveal common function than sequence similarity, which generally
implies common structure. The family classification approach used by
protein family databases is based on sequence similarity. Our
research takes a different approach, classifying proteins into
families using amino acid annotations. In previous work, the
transfer of annotations to other sequences in a protein family
and the family classification were done manually, which took three
months per database. This thesis presents ANACIN, a software tool
that automates annotation transfer and family classification. With
this software, annotation accuracy improved compared to manual
annotation. The
analysis of the results of automated protein family classification
led to the discovery of new protein families. The time for a complete
analysis was reduced to two days of computation plus one day of
manual interpretation of the results.
| format |
application/pdf
| | 1147113 Bytes | |